α-min: A Compact Approximate Solver For Finite-Horizon POMDPs
Authors
Abstract
In many POMDP applications in computational sustainability, it is important that the computed policy have a simple description, so that it can be easily interpreted by stakeholders and decision makers. One measure of simplicity for POMDP value functions is the number of α-vectors required to represent the value function. Existing POMDP methods seek to optimize the accuracy of the value function, which can require a very large number of α-vectors. This paper studies methods that allow the user to explore the tradeoff between the accuracy of the value function and the number of α-vectors. Building on previous point-based POMDP solvers, this paper introduces a new algorithm (α-min) that formulates a Mixed Integer Linear Program (MILP) to calculate approximate solutions for finite-horizon POMDP problems with limited numbers of α-vectors. At each time-step, α-min calculates α-vectors to greedily minimize the gap between current upper and lower bounds of the value function. In doing so, good upper and lower bounds are quickly reached, allowing a good approximation of the problem with few α-vectors. Experimental results show that α-min provides good approximate solutions given a fixed number of α-vectors on small benchmark problems, on a larger randomly generated problem, and on a computational sustainability problem concerning how best to manage the endangered Sumatran tiger.
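As a rough illustration of the accuracy-versus-size tradeoff described in the abstract, the sketch below evaluates a piecewise-linear value function as the maximum of dot products between a belief and a set of α-vectors, then measures how much value is lost at a few belief points when one α-vector is dropped. This is not the α-min algorithm (no MILP or bound updates are involved); the function names `value_at` and `max_gap` and the two-state numbers are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def value_at(belief: Vector, alpha_vectors: List[Vector]) -> Tuple[float, int]:
    """Return (value, index of the maximizing alpha-vector) at a belief.

    The value function is V(b) = max over alpha of (alpha . b).
    """
    best_value = float("-inf")
    best_index = -1
    for i, alpha in enumerate(alpha_vectors):
        v = sum(a * b for a, b in zip(alpha, belief))
        if v > best_value:
            best_value, best_index = v, i
    return best_value, best_index


def max_gap(beliefs: List[Vector],
            full_set: List[Vector],
            reduced_set: List[Vector]) -> float:
    """Largest loss in value over the sampled beliefs when using reduced_set.

    If reduced_set is a subset of full_set, the loss is never negative.
    """
    return max(
        value_at(b, full_set)[0] - value_at(b, reduced_set)[0]
        for b in beliefs
    )


# Hypothetical two-state problem: three alpha-vectors, three sampled beliefs.
alphas = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6)]
beliefs = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]

# Keep only the first two alpha-vectors and measure the loss in value.
gap = max_gap(beliefs, alphas, alphas[:2])
print(f"max value lost with 2 of 3 alpha-vectors: {gap:.3f}")
```

Here the dropped vector is the best one only near the uniform belief, so the loss is confined to that region; a solver limited to a fixed number of α-vectors has to decide which regions of belief space matter most.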
Similar resources
Three New Algorithms to Solve N-POMDPs
In many fields in computational sustainability, applications of POMDPs are inhibited by the complexity of the optimal solution. One way of delivering simple solutions is to represent the policy with a small number of α-vectors. We would like to find the best possible policy that can be expressed using a fixed number N of α-vectors. We call this the N-POMDP problem. The existing solver α-min app...
Optimal Control of Partiality Observable Markov Processes over a Finite Horizon
This report presents an approach to finding an exact solution to the optimal control of POMDPs (Partially Observable Markov Decision Processes) over a finite horizon under a few reasonable assumptions. The approach considers only finite-state Markov processes. By comparing MDPs and POMDPs from the point of view of optimal control policies, it will be demonstrated that solving POMDPs is harder than solving...
Bounded Dynamic Programming for Decentralized POMDPs
Solving decentralized POMDPs (DEC-POMDPs) optimally is a very hard problem. As a result, several approximate algorithms have been developed, but these do not have satisfactory error bounds. In this paper, we first discuss optimal dynamic programming and some approximate finite horizon DEC-POMDP algorithms. We then present a bounded dynamic programming algorithm. Given a problem and an error bou...
Orr Sommerfeld Solver Using Mapped Finite Difference Scheme for Plane Wake Flow
Linear stability analysis of the three dimensional plane wake flow is performed using a mapped finite difference scheme in a domain which is doubly infinite in the cross–stream direction of wake flow. The physical domain in cross–stream direction is mapped to the computational domain using a cotangent mapping of the form y = ?cot(??). The Squire transformation [2], proposed by Squire, is also us...
History-Based Controller Design and Optimization for Partially Observable MDPs
Partially observable MDPs provide an elegant framework for sequential decision making. Finite-state controllers (FSCs) are often used to represent policies for infinite-horizon problems as they offer a compact representation, simple-to-execute plans, and adjustable tradeoff between computational complexity and policy size. We develop novel connections between optimizing FSCs for POMDPs and the d...